Learning Gaussian Graphical Models with Observed or Latent FVSs
نویسندگان
چکیده
Gaussian Graphical Models (GGMs) or Gauss Markov random fields are widelyused in many applications, and the trade-off between the modeling capacity andthe efficiency of learning and inference has been an important research prob-lem. In this paper, we study the family of GGMs with small feedback vertexsets (FVSs), where an FVS is a set of nodes whose removal breaks all the cycles.Exact inference such as computing the marginal distributions and the partitionfunction has complexity O(kn) using message-passing algorithms, where k isthe size of the FVS, and n is the total number of nodes. We propose efficientstructure learning algorithms for two cases: 1) All nodes are observed, which isuseful in modeling social or flight networks where the FVS nodes often corre-spond to a small number of highly influential nodes, or hubs, while the rest ofthe networks is modeled by a tree. Regardless of the maximum degree, withoutknowing the full graph structure, we can exactly compute the maximum likelihoodestimate with complexity O(kn + n log n) if the FVS is known or in polyno-mial time if the FVS is unknown but has bounded size. 2) The FVS nodes arelatent variables, where structure learning is equivalent to decomposing an inversecovariance matrix (exactly or approximately) into the sum of a tree-structured ma-trix and a low-rank matrix. By incorporating efficient inference into the learningsteps, we can obtain a learning algorithm using alternating low-rank correctionswith complexity O(kn + n log n) per iteration. We perform experiments usingboth synthetic data as well as real data of flight delays to demonstrate the modelingcapacity with FVSs of various sizes.
منابع مشابه
A Stick-Breaking Likelihood for Categorical Data Analysis with Latent Gaussian Models
The development of accurate models and efficient algorithms for the analysis of multivariate categorical data are important and longstanding problems in machine learning and computational statistics. In this paper, we focus on modeling categorical data using Latent Gaussian Models (LGMs). We propose a novel logistic stick-breaking likelihood function for categorical LGMs that can exploit recent...
متن کاملParameter Estimation in Spatial Generalized Linear Mixed Models with Skew Gaussian Random Effects using Laplace Approximation
Spatial generalized linear mixed models are used commonly for modelling non-Gaussian discrete spatial responses. We present an algorithm for parameter estimation of the models using Laplace approximation of likelihood function. In these models, the spatial correlation structure of data is carried out by random effects or latent variables. In most spatial analysis, it is assumed that rando...
متن کاملLow Complexity Gaussian Latent Factor Models and a Blessing of Dimensionality
Learning the structure of graphical models from data usually incurs a heavy curse of dimensionality that renders this problem intractable in many real-world situations. The rare cases where the curse becomes a blessing provide insight into the limits of the efficiently computable and augment the scarce options for treating very under-sampled, high-dimensional data. We study a special class of G...
متن کاملNonparametric Latent Tree Graphical Models: Inference, Estimation, and Structure Learning
Tree structured graphical models are powerful at expressing long range or hierarchical dependency among many variables, and have been widely applied in different areas of computer science and statistics. However, existing methods for parameter estimation, inference, and structure learning mainly rely on the Gaussian or discrete assumptions, which are restrictive under many applications. In this...
متن کاملLearning Latent Tree Graphical Models
We study the problem of learning a latent tree graphical model where samples are available only from a subset of variables. We propose two consistent and computationally efficient algorithms for learning minimal latent trees, that is, trees without any redundant hidden nodes. Unlike many existing methods, the observed nodes (or variables) are not constrained to be leaf nodes. Our algorithms can...
متن کامل